Abstract

Given the description of a physical system in one of several forms (a set of
constraints, a Bayesian network, etc.) and a set of observations, the task of
model-based diagnosis is to find a suitable assignment to the modes of behavior
of individual components (this notion can also be extended to handle
transitions and dynamic systems [Kurien and Nayak, 2000]). Many formalisms have
been proposed in the past to characterize diagnoses and systems. These include
consistency-based diagnosis, fault models, abduction, combinatorial
optimization, and Bayesian model selection. Different approaches suit different
applications and different representational forms of the system description.
In this paper, we provide a unifying theme behind all these approaches based on
the notion of model counting.

By  doing  this,  we  are  able  to  provide  a universal 
characterization  of  diagnoses  that  is  independent  of 
the  representational  form  of  the  system  description. 

We also show how the shortcomings of previous approaches (mostly associated
with their inability to reason about different elements of knowledge like
probabilities and constraints) are removed in our framework. Finally, we report
on the computational tractability of diagnosis algorithms based on model
counting.

1 Introduction 

Diagnosis is an important component of autonomy for any intelligent agent.
Often, an intelligent agent plans a set of actions to achieve certain goals,
and because some conditions may be unforeseen, it must be able to reconfigure
its plan depending upon the state it is in. This state identification problem
is essentially a problem of diagnosis. In its simplest form, the problem of
diagnosis is to find a suitable assignment to the modes of behavior of
individual components in a static system (given some observations made on it).
It is possible to handle the case of dynamic systems by treating the transition
variables as components (in one sense) [Kurien and Nayak, 2000]. The theory
developed in this paper is therefore equally applicable to dynamic systems
(although we omit the discussion due to restrictions on the length of the
paper).

Many approaches have been used in the past to characterize diagnoses and
systems. Among the most comprehensive pieces of work are [de Kleer and
Williams, 1989], [Reiter, 1987], [Struss and Dressler, 1989], [Console et al.,
1989], [de Kleer et al., 1992], [Poole, 1994], [Kohlas et al., 1998] and
[Lucas, 2001]. The popular characterizations of diagnoses include
consistency-based diagnosis, fault models, abduction, combinatorial
optimization, and Bayesian model selection. These approaches are, however,
tailored to different applications and to the representational form in which
the system description is available. Each also has one or more shortcomings
arising from its inability to provide a framework that can incorporate
knowledge in different forms, such as probabilities and constraints.

In this paper, we provide a unifying theme behind all these approaches based on
the notion of model counting. By doing this, we are able to provide a universal
characterization of diagnoses independent of the representational form of the
system description. Because model counting bridges the gap between different
kinds of knowledge elements, the shortcomings of previous approaches are
removed.

2 Background 

Before  we  present  our  characterization  of  diagnoses  based  on 
model  counting,  we  choose  to  provide  a quick  overview  of 
the  previous  approaches  so  that  we  can  compare  and  contrast 
our  approach  with  them. 

Definition (Diagnosis System) A diagnosis system is a triple
(SD, COMPS, OBS) such that:

1. SD is a system description expressed in one of several forms: constraint
languages like propositional logic, probabilistic models like Bayesian
networks, etc. SD specifies both component behavior information and component
structure information (i.e. the topology of the system).

2. COMPS is a finite set of components of the system. A component $comp_i$
($1 \le i \le |COMPS|$) can behave in one of a finite set of modes ($M_i$). If
these modes are not specified explicitly, then we assume two modes: failed
($AB(comp_i)$) and normal ($\neg AB(comp_i)$).

3. OBS is a set of observations expressed as variable values.

Definition The task in a complete diagnosis call is to find a "suitable"
assignment of modes to all the components in the system given SD and OBS. The
task in a partial diagnosis call is to find a suitable assignment of modes to a
specified subset $S$ ($S \subseteq COMPS$) of the components in the system
given SD and OBS.

Unless  stated  otherwise,  we  will  use  the  term  “diagnosis” 
to  refer  to  a complete  diagnosis.  Later  in  the  paper  we  will 
show  that  the  characterization  of  partial  diagnoses  is  a simple 
extension  of  the  characterization  of  complete  diagnoses. 

Definition (Candidate) Given a set of integers $i_1 \ldots i_{|COMPS|}$ (such
that for $1 \le j \le |COMPS|$, $1 \le i_j \le |M_j|$), a candidate
$Cand(i_1 \ldots i_{|COMPS|})$ is defined as

$Cand(i_1 \ldots i_{|COMPS|}) = \bigwedge_{k=1}^{|COMPS|} (comp_k = M_k(i_k))$.

Here,  Mu(v)  denotes  the  vth  element  in  the  set  Mu  (assumed 
to  be  indexed  in  some  way). 

Notation  When  the  indices  are  implicit  or  arbitrary,  we  will 
use  the  symbol  H to  denote  a candidate  or  a hypothesis  i.e. 
an  assignment  of  modes  to  all  the  components  in  the  system. 

Consistency-Based  Diagnosis 

The  task  of  consistency-based  diagnosis  can  be  summarized 
as  follows.  Note  that  the  definition  of  a diagnosis  in  this 
framework  does  not  discriminate  between  single  and  multi- 
ple faults. 

Definition (Consistency-Based Diagnosis) A candidate $H$ is a diagnosis if and
only if $SD \cup OBS \cup H$ is satisfiable.

There are other characterizations of diagnoses under this framework, called
partial diagnoses, prime diagnoses, kernel diagnoses, etc. We will examine
these later in the paper.

Fault  Models 

Consider diagnosing a system consisting of three bulbs $B_1$, $B_2$ and $B_3$
connected in parallel to the same voltage source $V$ under the observations
$off(B_1)$, $off(B_2)$ and $on(B_3)$. $AB(V) \wedge AB(B_3)$ is a diagnosis
under the consistency-based formalization of diagnosis if we had constraints
only of the form $\neg AB(B_3) \wedge \neg AB(V) \rightarrow on(B_3)$.
Intuitively, however, this does not seem reasonable because $B_3$ cannot be on
without $V$ working properly. One way to get around this is to include fault
models in the system. These are constraints that explicitly describe the
behavior of a component when it is not in its nominal mode (the most expected
mode of behavior of a component). Such a constraint in this example would be
$AB(B_3) \rightarrow off(B_3)$. Diagnosis can become indiscriminate without
fault models. It is also easy to see that the consistency-based approach can
exploit fault models (when they are specified) to produce more intuitive
diagnoses (like only $B_1$ and $B_2$ being abnormal).
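
The bulb example can be checked by brute force. The sketch below is only an illustration under stated assumptions: the names `OBS`, `consistent` and `diagnoses` are ours, every bulb is given the same nominal and fault constraints as $B_3$, and no fault model is assumed for the source $V$.

```python
from itertools import product

BULBS = ("B1", "B2", "B3")
# Observed state of each bulb: True means on, False means off.
OBS = {"B1": False, "B2": False, "B3": True}

def consistent(ab, use_fault_models):
    """Check a mode assignment against SD and OBS.

    `ab` maps "V", "B1", "B2", "B3" to True (abnormal) or False (normal).
    """
    for bulb in BULBS:
        # Nominal model: a normal bulb on a normal source must be on.
        if not ab[bulb] and not ab["V"] and not OBS[bulb]:
            return False
        # Optional fault model: an abnormal bulb must be off.
        if use_fault_models and ab[bulb] and OBS[bulb]:
            return False
    return True

def diagnoses(use_fault_models):
    """Return every consistent candidate as the set of abnormal components."""
    comps = ("V",) + BULBS
    found = []
    for values in product((False, True), repeat=len(comps)):
        ab = dict(zip(comps, values))
        if consistent(ab, use_fault_models):
            found.append(frozenset(c for c in comps if ab[c]))
    return found
```

Without fault models, the set {V, B3} appears among the diagnoses; with the fault model it is ruled out, while {B1, B2} remains.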

Diagnosis  as  Combinatorial  Optimization 

Fault models, however, can be too restrictive. We may not be able to model, for
example, the case of some strange source of power turning $B_3$ on. The way out
of this is to allow for many modes of behavior for each component of the
system. Every component has a set of modes (in which it can behave) with
associated models. One of these is the nominal (or normal) mode and the others
are fault modes. Each component also has an unknown fault mode with the empty
model. The unknown mode tries to capture the modeling incompleteness assumption
(obscure modes that we cannot model in the system). In addition, each mode has
an associated probability, which is the prior probability of the component
being in that mode. Diagnosis can now be cast as a combinatorial optimization
problem of assigning modes of behavior to each component such that the
assignment is not only consistent with $SD \cup OBS$, but also maximizes the
product of the prior probabilities associated with those modes (assuming
independence in the behavior of components).

Definition (Combinatorial Optimization) A candidate
$H = Cand(i_1 \ldots i_{|COMPS|})$ is a diagnosis if and only if
$SD \cup H \cup OBS$ is satisfiable and
$P(H) = \prod_{k=1}^{|COMPS|} P(comp_k = M_k(i_k))$ is maximized.
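
As a minimal sketch of this definition, the following code enumerates all candidates of the three-bulb system, keeps the consistent ones, and returns the one with the largest prior. The prior values in `P_ABNORMAL` are hypothetical numbers chosen for illustration, and the function names are ours; exhaustive enumeration is used only because the example is tiny.

```python
from itertools import product
from math import prod

# Hypothetical prior probability of each component being abnormal.
P_ABNORMAL = {"V": 0.001, "B1": 0.05, "B2": 0.05, "B3": 0.05}
# Observed bulb states: True means on, False means off.
OBS = {"B1": False, "B2": False, "B3": True}

def consistent(ab):
    """True if the mode assignment `ab` is consistent with SD and OBS."""
    for bulb, is_on in OBS.items():
        if not ab[bulb] and not ab["V"] and not is_on:
            return False  # a normal bulb on a normal source must be on
        if ab[bulb] and is_on:
            return False  # fault model: an abnormal bulb must be off
    return True

def prior(ab):
    """P(H) as a product of independent per-component mode priors."""
    return prod(P_ABNORMAL[c] if ab[c] else 1.0 - P_ABNORMAL[c] for c in ab)

def best_candidate():
    """Consistent candidate with the highest prior probability."""
    comps = tuple(P_ABNORMAL)
    candidates = (dict(zip(comps, values))
                  for values in product((False, True), repeat=len(comps)))
    feasible = [ab for ab in candidates if consistent(ab)]
    return max(feasible, key=prior)
```

With these assumed priors, the selected candidate marks only `B1` and `B2` as abnormal, even though a faulty source alone is also consistent.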

Diagnosis  as  Bayesian  Model  Selection 

Sometimes we have sufficient experience and statistical information associated
with the behavior of a system. In such cases, the system description is usually
available in the form of a probabilistic model like a Bayesian network. Given
some observations made on the system, the problem of diagnosis then becomes a
Bayesian model selection problem.

Definition (Bayesian Model Selection) A candidate $H$ is a diagnosis (for a
probabilistic model of the system, SD) if and only if it maximizes the
posterior probability $P(H \mid SD, OBS)$.

Diagnosis  as  Abduction 

Yet another intuition behind characterizing diagnoses is the idea of
explanation. Explanatory diagnoses essentially try to capture the notion of
cause and effect in the physics of the system. The observations are
asymmetrically divided into inputs ($I$) and outputs ($O$) [de Kleer et al.,
1992]. The inputs ($I$) are those observation variables that can be controlled
externally.

Definition (Abductive Diagnosis) An abductive diagnosis for
$(SD, COMPS, OBS = I \cup O)$ is a candidate $H$ such that $SD \cup I \cup H$
is satisfiable and $SD \cup I \cup H \rightarrow O$.

3 Probabilities  and  Model  Counting 

Before we present our own characterization of diagnoses based on the notion of
model counting, we show an interesting relationship between probabilities and
model counting (see Figure 1). The model counting problem is the problem of
counting the number of solutions to a SAT (satisfiability) problem or a CSP
(constraint satisfaction problem).

Definition (Binary representation of a CPT) The binary representation of a CPT
(Conditional Probability Table) is a table in which all the floating-point
entries of the CPT are rewritten in binary form (base 2) up to a precision of
$P$ binary digits, and the binary point, along with any redundant zeroes to the
left of it, is removed.

We  provide  a set  of  definitions  and  results  relating  the 
probability  of  a partial  assignment  A to  the  number  of 
solutions  (under  the  same  partial  assignment  A)  to  CSPs 
composed  out  of  the  binary  representations  of  the  CPTs  (see 
Figure  1).  Basic  definitions  related  to  CSPs  can  be  found  in 
[Dechter,  1992]. 

Definition (Zero-one-layer of a CPT) The $k$th zero-one-layer of a CPT is a
table of zeroes and ones derived from the $k$th bit position of all the numbers
in the binary representation of that CPT.

Figure 1: The conditional probability tables (CPTs) of a Bayes net are shown on
the left of the vertical line L. On the right of L are the binary
representations of these CPTs (example shown for 0.4 in decimal, which is 0.011
in binary when truncated to three binary digits). CPTs correspond to families
in the Bayes net; let the number of families be $C$.

Definition (Weight of a zero-one-layer) The $k$th zero-one-layer of a CPT is
defined to have weight $2^{-k}$.
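
The two definitions above can be sketched in a few lines. This is an illustrative sketch only: the helper names `binary_digits` and `zero_one_layers` are ours, CPT entries are assumed to lie in [0, 1), and digits beyond the precision are truncated rather than rounded.

```python
def binary_digits(p, precision):
    """First `precision` binary digits of p after the binary point.

    Assumes 0 <= p < 1; digits beyond `precision` are truncated.
    """
    digits = []
    for _ in range(precision):
        p *= 2
        bit = int(p)
        digits.append(bit)
        p -= bit
    return digits

def zero_one_layers(cpt, precision):
    """Map each k (1-based) to the k-th zero-one-layer of `cpt`.

    `cpt` maps value tuples to probabilities; each layer maps the same
    tuples to the k-th bit of the corresponding entry.
    """
    bits = {key: binary_digits(p, precision) for key, p in cpt.items()}
    return {k: {key: b[k - 1] for key, b in bits.items()}
            for k in range(1, precision + 1)}
```

For the entry 0.4 of Figure 1 and a precision of three digits, `binary_digits(0.4, 3)` yields `[0, 1, 1]`, so the entry contributes a zero to the first layer and a one to the second and third layers, whose weights are $2^{-2}$ and $2^{-3}$.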

Definition  (CSP  Compilation  of  a CPT)  The  kth  CSP 
compilation  of  a CPT  is  a constraint  over  the  variables  of  the 
CPT  that  is  derived  from  the  kth  zero-one-layer  of  the  CPT 
such  that  zeroes  correspond  to  disallowed  tuples  and  ones 
correspond  to  allowed  tuples. 

Definition (CSP Compilation of Network) The $(k_1, k_2 \ldots k_C)$ CSP
compilation of the network is the set of constraints
$S = \{s_i : s_i$ is the $k_i$th CSP compilation of the $i$th CPT$\}$.

Definition (Weight of a CSP Compilation) The weight of a
$(k_1, k_2 \ldots k_C)$ CSP compilation of a network is defined to be equal to
$2^{-(k_1 + k_2 + \cdots + k_C)}$.

Property There are an exponential number of CSP compilations for a given
network. Since each CPT expands into $P$ zero-one-layers and a CSP for the
entire network can be compiled by taking any of these $P$ layers for each CPT
(there are $C$ CPTs), the total number of CSP compilations possible is $P^C$.
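
To make the count concrete, the index tuples $(k_1, k_2 \ldots k_C)$ can be enumerated directly. The function names below are ours and serve only as an illustration of the property and of the weight definition.

```python
from itertools import product

def csp_compilations(num_cpts, precision):
    """All index tuples (k_1, ..., k_C) with 1 <= k_i <= precision."""
    return list(product(range(1, precision + 1), repeat=num_cpts))

def weight(indices):
    """Weight 2^-(k_1 + ... + k_C) of the compilation with these indices."""
    return 2.0 ** -sum(indices)
```

For example, with $C = 3$ CPTs and precision $P = 4$, `csp_compilations(3, 4)` has $4^3 = 64$ entries.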

Notation We will use the notation $h_{ij}$ to mean the $j$th CSP compilation of
the $i$th CPT. Let $A$ indicate a complete or partial assignment to the
variables. If $A$ is an assignment that instantiates all the variables of
$CPT_i$, then we will use the notation $h_{ij}(A)$ to indicate whether or not
$A$ satisfies $h_{ij}$ (1 if it does, 0 otherwise). If $A$ is a complete
assignment for all the variables in the network, then all variables for all
CPTs are instantiated and we will use the notation
$CSP_{(k_1, k_2 \ldots k_C)}(A)$ to indicate whether $A$ satisfies all the
constraints $h_{i k_i}$ ($1 \le i \le C$). If $A$ is not a complete assignment
for all the variables, then we will use the notation
$\#CSP_{(k_1, k_2 \ldots k_C)}(A)$ to indicate the number of solutions to the
$(k_1, k_2 \ldots k_C)$ CSP compilation of the network that share the same
partial assignment as $A$.

Theorem 1 The probability of a complete assignment
$A = (X_1 = x_1 \ldots X_n = x_n)$ is just the sum of the weights of the
different CSP compilations of the network that are satisfied by this complete
assignment. That is,

$P(A) = \sum_{(k_1, k_2 \ldots k_C)} CSP_{(k_1, k_2 \ldots k_C)}(A) \, 2^{-(k_1 + k_2 + \cdots + k_C)}$
(for all $1 \le i \le C$, $1 \le k_i \le P$).

Proof Consider the complete assignment $A = (X_1 = x_1 \ldots X_n = x_n)$ for
all the variables. The probability of this assignment is equal to the product
of the probabilities defined locally by each CPT. Now using the fact that the
$t$th bit in the binary representation of this local value has been written out
as an allowed or disallowed tuple in the $t$th CSP compilation of that CPT, we
can rewrite the local value for $A$ in the $i$th CPT as
$\sum_{j=1}^{P} h_{ij}(A) \, 2^{-j}$. The total probability is then just the
product over all local values:

$P(A) = \prod_{i=1}^{C} \sum_{j=1}^{P} h_{ij}(A) \, 2^{-j}$.

Expanding the product, we obtain

$\sum_{(k_1, k_2 \ldots k_C)} 2^{-(k_1 + k_2 + \cdots + k_C)} \prod_{j=1}^{C} h_{j k_j}(A) = \sum_{(k_1, k_2 \ldots k_C)} 2^{-(k_1 + k_2 + \cdots + k_C)} \, CSP_{(k_1, k_2 \ldots k_C)}(A)$.
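
Theorem 1 can be checked numerically on a small case. The sketch below uses a hypothetical two-node network X -> Y whose CPT entries are exactly representable with three binary digits (an assumption made so that no truncation error arises); all names are ours.

```python
from itertools import product

P = 3  # precision in binary digits

# Hypothetical network X -> Y with dyadic CPT entries.
CPT_X = {(0,): 0.25, (1,): 0.75}                 # P(X), keyed by (x,)
CPT_Y = {(0, 0): 0.5, (0, 1): 0.5,
         (1, 0): 0.125, (1, 1): 0.875}           # P(Y | X), keyed by (x, y)

def bit(p, k):
    """k-th binary digit (1-based) of p after the binary point."""
    return int(p * 2 ** k) % 2

def prob_by_compilations(x, y):
    """Sum of the weights of all CSP compilations satisfied by (x, y)."""
    total = 0.0
    for k1, k2 in product(range(1, P + 1), repeat=2):
        # h_{1,k1}(A) and h_{2,k2}(A): the tuple must be allowed in both layers.
        if bit(CPT_X[(x,)], k1) and bit(CPT_Y[(x, y)], k2):
            total += 2.0 ** -(k1 + k2)
    return total

def prob_direct(x, y):
    """Usual product of local CPT entries."""
    return CPT_X[(x,)] * CPT_Y[(x, y)]
```

For every complete assignment of this toy network, `prob_by_compilations` and `prob_direct` agree.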


Theorem 2 (Model Counting) The marginalized probability of a partial assignment
$A$ to a set of variables $S \subseteq V$, where $V$ is the set of all network
variables, is equal to the product of the weight and the number of solutions
(under the same partial assignment $A$) summed over all CSP compilations of the
network. That is,

$P(A) = \sum_{(k_1, k_2 \ldots k_C)} \#CSP_{(k_1, k_2 \ldots k_C)}(A) \, 2^{-(k_1 + k_2 + \cdots + k_C)}$
(for all $1 \le i \le C$, $1 \le k_i \le P$).

Proof From the previous theorem, we know that the probability of a complete
assignment $B$ is
$\sum_{(k_1, k_2 \ldots k_C)} CSP_{(k_1, k_2 \ldots k_C)}(B) \, 2^{-(k_1 + k_2 + \cdots + k_C)}$
(for all $1 \le i \le C$, $1 \le k_i \le P$). Now, the marginalized probability
of a partial assignment $A$ is just the sum of the probabilities of all complete
assignments $B$ that agree with $A$ on the assignment to variables in $S$. That
is, $P(A) = \sum_{B} P(B) \, (B(S) = A)$. Using the result of the previous
theorem to expand $P(B)$, we have

$P(A) = \sum_{B} \sum_{(k_1, k_2 \ldots k_C)} CSP_{(k_1, k_2 \ldots k_C)}(B) \, 2^{-(k_1 + k_2 + \cdots + k_C)} \, (B(S) = A)$.

Switching the two summations and noting that, for each compilation,
$\sum_{B} CSP_{(k_1, k_2 \ldots k_C)}(B) \, (B(S) = A)$ is the same as
$\#CSP_{(k_1, k_2 \ldots k_C)}(A)$, we get that

$P(A) = \sum_{(k_1, k_2 \ldots k_C)} \#CSP_{(k_1, k_2 \ldots k_C)}(A) \, 2^{-(k_1 + k_2 + \cdots + k_C)}$.
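
The same kind of numerical check works for Theorem 2. The sketch below is self-contained and uses a hypothetical two-node network X -> Y with dyadic CPT entries and precision three; the names and numbers are assumptions for illustration only.

```python
from itertools import product

P = 3  # precision in binary digits

# Hypothetical network X -> Y with dyadic CPT entries.
CPT_X = {(0,): 0.25, (1,): 0.75}                 # P(X), keyed by (x,)
CPT_Y = {(0, 0): 0.5, (0, 1): 0.5,
         (1, 0): 0.125, (1, 1): 0.875}           # P(Y | X), keyed by (x, y)

def bit(p, k):
    """k-th binary digit (1-based) of p after the binary point."""
    return int(p * 2 ** k) % 2

def count_solutions(k1, k2, partial):
    """#CSP_(k1,k2)(A): solutions of the compilation that extend `partial`."""
    count = 0
    for x, y in product((0, 1), repeat=2):
        full = {"X": x, "Y": y}
        if any(full[var] != val for var, val in partial.items()):
            continue  # this complete assignment disagrees with A
        if bit(CPT_X[(x,)], k1) and bit(CPT_Y[(x, y)], k2):
            count += 1
    return count

def marginal_by_counting(partial):
    """Weighted sum of solution counts over all CSP compilations."""
    return sum(count_solutions(k1, k2, partial) * 2.0 ** -(k1 + k2)
               for k1, k2 in product(range(1, P + 1), repeat=2))
```

Here `marginal_by_counting({"Y": 1})` reproduces the marginal 0.25 * 0.5 + 0.75 * 0.875 = 0.78125, and the empty partial assignment gives total probability 1.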

3.1  Probability-Equivalents  and  Incorporation  of 
Probabilities 

Often, we are given information in many forms. Probabilities are natural
information elements when there is an element of statistical experience that we
want to exploit. In other cases, constraints may be the most appropriate to
use. The general idea in our framework is to use probabilities when we
explicitly have them and to use model counting otherwise. We will use
$\#(S_1, S_2 \ldots)$ to mean the number of consistent models to
$(S_1 \cup S_2 \ldots)$ (with respect to the uninstantiated free variables in
SD). Theorems 1 and 2 establish that model counting is a weaker form of
probabilities and that probabilities provide only precision information over
model counting. Therefore, it is natural for us to use probabilities (to
describe events) when we have them explicitly, and to use model counting
otherwise.
For any event $E$, we use the expressions $\frac{\#(SD, E)}{\#(SD)}$ and $P(E)$
almost equivalently, except that we use the former when we do not know $P(E)$
explicitly. This framework allows us to reason about both probabilities and
constraints.

Definition (Probability Equivalents) The probability equivalent of $\#(SD, E)$
for any event $E$ is defined to be $P(E) \, \#(SD)$ when $P(E)$ is given
explicitly.

4 Diagnosis  as  Model  Counting 

In  this  section,  we  characterize  diagnoses  based  on  model 
counting.  We  will  then  show  how  all  the  previous  approaches 
are  captured  under  this  formalization.  For  the  first  part  of 
the  discussion  we  will  consider  only  complete  diagnoses  (an 
assignment  of  modes  for  all  the  components). 

Definition (Model Counting Characterization) A diagnosis is a candidate $H$
that maximizes the number of consistent models to $SD \cup OBS \cup H$, using
probability equivalents wherever necessary.

Notation We will use $M(H)$ to denote $\#(SD, OBS, H)$ (the number of
consistent models to $SD \cup OBS \cup H$) when SD and OBS follow from context.

Theorem 3 (Capturing Consistency-Based Diagnosis) Consistency-based diagnosis
is looking for a hypothesis $H$ for which $M(H)$ is non-zero.

Proof By definition, consistency-based diagnosis chooses $H$ such that
$SD \cup OBS \cup H$ is consistent. In other words, there exists at least one
satisfying assignment for $SD \cup OBS \cup H$. Clearly, this is equivalent to
saying that $M(H)$ is non-zero.
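
As an illustration of $M(H)$ and of Theorem 3, the sketch below counts models for a hypothetical system with a source V and two bulbs, where the state of the second bulb is deliberately left unobserved so that it acts as a free variable. The system, the variable names and the fault models are assumptions made for this example.

```python
from itertools import product

COMPS = ("V", "B1", "B2")
OBS = {"on1": False}   # bulb B1 is observed to be off
FREE = ("on2",)        # the state of bulb B2 is not observed

def satisfies(ab, state):
    """True if modes `ab` and variable values `state` satisfy SD."""
    for bulb, var in (("B1", "on1"), ("B2", "on2")):
        if not ab[bulb] and not ab["V"] and not state[var]:
            return False  # a normal bulb on a normal source must be on
        if ab[bulb] and state[var]:
            return False  # fault model: an abnormal bulb must be off
    return True

def count_models(ab):
    """M(H): number of assignments to the free variables consistent with H."""
    count = 0
    for values in product((False, True), repeat=len(FREE)):
        state = dict(OBS, **dict(zip(FREE, values)))
        if satisfies(ab, state):
            count += 1
    return count

def consistency_based_diagnoses():
    """Candidates H with M(H) > 0, as in Theorem 3."""
    candidates = (dict(zip(COMPS, values))
                  for values in product((False, True), repeat=len(COMPS)))
    return [ab for ab in candidates if count_models(ab) > 0]
```

In this toy system the all-normal candidate has no models, while six of the eight candidates have at least one model and are therefore consistency-based diagnoses.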

Theorem 4 (Capturing Abduction) Abduction chooses a hypothesis $H$ that
maximizes $M(H)$ assuming uniformity in prior probabilities $P(H)$.

Proof The maximum value of $\#(SD, OBS = I \cup O, H)$ is $\#(SD, H, I)$, and
this happens when $H \cup SD \cup I \rightarrow O$. Given that the input
variables are controlled externally, we know that
$\#(SD, H) = N(I) \, \#(SD, H, I)$. Here, $N(I)$ is a constant that measures the
number of different values for the input variables. Since $\#(SD, H)$ is
equivalent to $P(H) \, \#(SD)$, which we assumed to be a constant for all $H$,
maximizing $\#(SD, OBS, H)$ is equivalent to finding a hypothesis $H$ for which
$I \rightarrow O$ (under SD). The fact that abduction requires $H$ to be
consistent is also captured, because if $H$ is inconsistent, then $M(H) = 0$
and clearly $M(H)$ will not be maximized.

Theorem 5 (Capturing Bayesian Model Selection) Bayesian model selection chooses
a hypothesis $H$ such that it maximizes the probability equivalent of $M(H)$.

Proof The probability equivalent of $M(H) = \#(SD, OBS, H)$ is $P(OBS, H)$.
Clearly, if we are maximizing $P(OBS, H)$ then we are maximizing
$P(H \mid OBS) \, P(OBS)$. Since $P(OBS)$ is independent of $H$, this is
equivalent to maximizing $P(H \mid OBS)$, which is exactly what Bayesian model
selection does.
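
The normalization argument in this proof can be seen on a very small case. The sketch below assumes a hypothetical single component with two modes and one observed alarm; the numbers in `PRIOR` and `LIKELIHOOD` are invented for illustration.

```python
# Hypothetical single-component system with one observed alarm variable.
PRIOR = {"normal": 0.9, "failed": 0.1}        # P(H)
LIKELIHOOD = {"normal": 0.1, "failed": 0.8}   # P(alarm = on | H)

def joint(h):
    """P(OBS, H) for the observation alarm = on."""
    return PRIOR[h] * LIKELIHOOD[h]

def posterior(h):
    """P(H | OBS), obtained by dividing by P(OBS), which does not depend on H."""
    evidence = sum(joint(g) for g in PRIOR)
    return joint(h) / evidence

best_by_joint = max(PRIOR, key=joint)
best_by_posterior = max(PRIOR, key=posterior)
```

Because the evidence term is the same for every hypothesis, both maximizations select the same mode.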

Theorem 6 (Capturing Combinatorial Optimization) Combinatorial optimization is
looking for a hypothesis $H$ which maximizes $P(H)$ under the condition that
$M(H)$ is non-zero.

Proof As noted earlier, $H$ is consistent with $SD \cup OBS$ if and only if
$M(H)$ is non-zero. We also know that combinatorial optimization is looking for
a consistent $H$ which maximizes $P(H)$. The theorem follows as a simple
consequence of the above two statements. Basically, combinatorial optimization
maximizes only the prior probabilities of hypotheses (instead of maximizing the
equivalent of the posterior probabilities) unless they are obviously ruled out
by being inconsistent.

4.1  Consequences  (Removing  Previous 
Shortcomings) 

We now show the consequences of formalizing diagnosis as model counting. In
particular, we identify problems with previous approaches and show how model
counting removes all of them.

Problems  with  Consistency-Based  Diagnosis 

One of the problems with consistency-based diagnosis is that it allows for
non-intuitive hypotheses as diagnoses. It provides only a necessary but not a
sufficient condition on the hypotheses that can be qualified as diagnoses. By
itself, it is of little value unless we use an elaborate set of fault models to
remove non-intuitive hypotheses that could otherwise be consistent. Model
counting removes these problems because of its ability to merge and incorporate
the notions of both consistency and probabilities. In one sense, one can think
of model counting as giving us a measure of the degree to which a hypothesis is
consistent with SD and OBS. Some of these problems are alternatively addressed
in [Kohlas et al., 1998] and [Lucas, 2001].

Problems  with  Fault  Models 

The problem with fault models is that of over-restriction (as explained at the
beginning of the paper). We need to be able to reason not only about
constraints relating SD and OBS, but also about any other kind of information
we may have, for example in the form of probabilities. The over-restriction
problem can be removed by introducing probabilities. These probabilities can
then be used in the unified framework of model counting.

Problems  with  Abduction 

Like the consistency-based approaches, explanatory diagnoses are also unable to
incorporate and reason about probabilities. Yet another problem with abduction
is that it assumes we have completely modeled all cause-effect relationships in
our system. This contradicts our modeling incompleteness assumption and is an
unnecessary restriction on SD. Model counting removes this problem in a way
very similar to how probabilities were used to deal with the modeling
incompleteness assumption. Alternate treatments for these problems can be found
in [Poole, 1994] (which links abduction with probabilistic reasoning) and
[Console et al., 1989] (which addresses the modeling incompleteness
assumption).

Problems  with  Diagnosis  as  Bayesian  Model  Selection 

Bayesian model selection agrees with our characterization of diagnoses. The
only problem it poses is that it requires SD to be in the form of a Bayesian
network with known probabilities. Modeling a physical system as a Bayesian
network is in many cases non-intuitive, especially when certain probability
terms are hard to obtain. Parts of the system may be better expressed in the
form of constraints or automata. In such cases, Bayesian model selection does
not extend in a natural way and model counting is the right substitute (because
it is defined under all frameworks).

Problems  with  Diagnosis  as  Combinatorial  Optimization 

One problem associated with casting diagnosis as a combinatorial optimization
problem is that of being unable to give explanatory diagnoses a preference over
the rest. Clearly, we would like to prefer hypotheses that not only maximize
the prior probability $P(H)$ but that are also explanatory rather than just
being consistent with $SD \cup OBS$. One way to incorporate this preference is
to find all consistent hypotheses that maximize $P(H)$ and to pick an
explanatory one among them. The question that arises then is how we would
compare two hypotheses, one of which is explanatory and the other just
consistent (but not explanatory), with the latter having a slightly better
prior probability. This question is left unanswered under the combinatorial
optimization formulation of diagnoses.
In the model counting framework, however, it is easy to see that we really have
to maximize $P(H) \, \frac{\#(SD, OBS, H)}{\#(SD, H)}$. The second factor is
maximized for explanatory diagnoses, but this is as much preference as we
attach to them.

Another problem with the combinatorial optimization formulation is that
probabilities are restricted to only behavior modes of components and only
these prior probabilities are maximized. There is no framework to reason about
probabilistic information connected with observation variables.

5 Partial  Diagnoses 

Sometimes, we are interested in finding a suitable assignment of modes to a
specified subset $S$ of the components COMPS rather than for all components. We
argue that our characterization of diagnoses under the model counting framework
remains unchanged.

Definition (Candidate) Given a set of integer tuples
$(k_1, i_{k_1}) \ldots (k_n, i_{k_n})$ such that for
$1 \le j \le n \le |COMPS|$, $1 \le i_{k_j} \le |M_{k_j}|$, a candidate
$Cand((k_1, i_{k_1}) \ldots (k_n, i_{k_n}))$ is defined as

$Cand((k_1, i_{k_1}) \ldots (k_n, i_{k_n})) = \bigwedge_{g=1}^{n} (comp_{k_g} = M_{k_g}(i_{k_g}))$.

Notation When the indices are implicit or arbitrary, we will use the notation
$H_S$ to denote a candidate or a hypothesis, i.e. a set of mode assignments to
all the components in $S \subseteq COMPS$.

Definition (Model Counting Characterization) A partial diagnosis for
$S \subseteq COMPS$ is an assignment of modes $H_S$ to the components in $S$
that maximizes $\#(SD, OBS, H_S)$, using probability equivalents wherever
necessary.

It is now not hard to verify that all previous approaches are captured in a way
very similar to that for complete diagnoses. This is essentially a consequence
of the theorem that relates the number of consistent models for
$(SD, OBS, H_S)$ to the marginalized probability of $H_S$ (Theorem 2). Instead
of presenting the proofs again (and making repetitive arguments), we choose to
allude to another set of characterizations mostly associated with
consistency-based diagnosis. These are the notions of partial (a different
characterization in consistency-based diagnosis), kernel and prime diagnoses.
These notions have the same kind of drawbacks associated with the general
consistency-based framework [de Kleer et al., 1992], and our investigation into
these notions is just in the spirit of understanding their relationship to
model counting.

Definition An AB-literal is $AB(c)$ or $\neg AB(c)$ for some component $c$ in
COMPS. An AB-clause is a disjunction of AB-literals containing no complementary
pair of AB-literals.

Definition A conflict of (SD, COMPS, OBS) is an AB-clause entailed by
$SD \cup OBS$. A minimal conflict of (SD, COMPS, OBS) is a conflict no proper
sub-clause of which is a conflict of (SD, COMPS, OBS).

Definition  (Consistency-Based  Characterization)  The  partial 
diagnoses  of  (SD, COMPS,  OBS)  are  the  implicants  of  the 
minimal  conflicts  of  (SD,  COMPS,  OBS). 

Theorem 7 A partial diagnosis in the consistency-based framework, identifying
an implicant $T$ of the minimal conflicts of $SD \cup OBS$, is also a partial
diagnosis in the model-counting framework maximizing
$M(H_S) = \#(SD, OBS, H_S)$ for $S$ = variables of the implicant $T$, but with
free variables limited to abnormality (AB) variables.

Proof The implicant $T$ fixes an assignment for the components in $S$ but
leaves $COMPS \setminus S$ unassigned. Let the set of minimal conflicts of
$SD \cup OBS$ be $\pi$. Let $\#_{AB}(E)$ denote the number of consistent models
of $E$ restricted to free variables being from the uninstantiated AB-variables.
Since $T$ is an implicant of $\pi$, all models of $T$ (restricted to
AB-variables) also satisfy $\pi$ and are hence consistent with $SD \cup OBS$.
This makes $\#_{AB}(SD, OBS, T) = \#_{AB}(T)$. In general, since
$\#_{AB}(SD, OBS, T)$ is upper bounded by $\#_{AB}(T)$, the truth of the
theorem follows.

Definition (Consistency-Based Characterization) A kernel diagnosis identifies
the prime implicants of the minimal conflicts of $SD \cup OBS$.

Without a detailed discussion (due to lack of space), we claim that this notion
is related to yet another task in diagnosis: that of "representing" complete
diagnoses. This task is orthogonal to "characterizing" them [Kumar, 2002].
There are other notions of diagnosis, called prime diagnoses, irredundant
diagnoses, etc. [de Kleer et al., 1992], arising mostly out of the task of
"representation", all of which are captured in one way or another by the model
counting framework (we omit the details in this paper).

6 Related  Work  on  Characterizing  Diagnoses 
and  Model  Counting 

Related work in trying to unify model-based and probabilistic approaches can be
found in [Poole, 1994], [Kohlas et al., 1998], [Lucas, 1998] and [Lucas, 2001].
[Poole, 1994] links abductive reasoning and Bayesian networks, and general
diagnostic reasoning systems with assumption-based reasoning. [Kohlas et al.,
1998] shows how to take results obtained by consistency-based reasoning systems
into account when computing a posterior probability distribution conditioned on
the observations (the independence assumptions are lifted in [Lucas, 2001]).
[Lucas, 1998] gives a semantic analysis of different diagnosis systems using
basic set theory. The issue of the modeling incompleteness assumption is
referred to in [Console et al., 1989].

Diagnosis algorithms based on model counting have not yet been developed.
However, the problem of model counting itself has been extensively dealt with.
Although this problem is #P-complete, there are a variety of techniques that
have been used to make it feasible in practice (including approximate counting
algorithms running in polynomial time, structure-based techniques, etc.). Model
counting for a SAT instance in DNF (disjunctive normal form) is simpler than it
is for CNF (conjunctive normal form). For DNF, there is a fully polynomial
randomized approximation scheme (FPRAS) to estimate the number of solutions
[Karp et al., 1989]. CDP and DDP are two model-counting algorithms for SAT
instances in CNF [Bayardo and Pehoushek, 2000]. A version of RELSAT has also
been used to do model counting on SAT instances in CNF. If a propositional
theory is in a special form called the smooth, deterministic, decomposable,
negation normal form (sd-DNNF), then model counting can be made tractable and
incremental [Darwiche, 2001].


7 Summary  and  Future  Work 

In this paper, we provided a unifying characterization of diagnoses based on
the idea of model counting. In the process, we compared and contrasted our
formalization with the previous approaches, in many cases removing the problems
associated with them. Because model counting bridges the gap between
probabilities and constraints and is well-defined for many representational
forms of information available about the system, we believe that the model
counting characterization of diagnoses is useful and general in the sense of
not imposing any restrictions on the representational form of the system
description.

As for future work, we are investigating and developing computationally
tractable algorithms based on the model counting characterization of diagnoses.
Advances in model counting algorithms (approximate counting, structure-based
methods, etc.) are encouraging for this goal. We are also working on variants
of the diagnosis problem (e.g. when we are interested in a set of candidate
hypotheses rather than just one).